Regularization Through Feature Knock Out
Authors
Abstract
In this paper, we present and analyze a novel regularization technique based on enhancing our dataset with corrupted copies of the original data. The motivation is that since the learning algorithm lacks information about which parts of the data are reliable, it has to produce more robust classification functions. We then demonstrate how this regularization leads to redundancy in the resulting classifiers, which is somewhat in contrast to the common interpretations of the Occam's razor principle. Using this framework, we propose a simple addition to the gentle boosting algorithm which enables it to work with only a few examples. We test this new algorithm on a variety of datasets and show convincing results.
Copyright © Massachusetts Institute of Technology, 2004.
This report describes research done at the Center for Biological & Computational Learning, which is in the McGovern Institute for Brain Research at MIT, as well as in the Dept. of Brain & Cognitive Sciences, and which is affiliated with the Computer Sciences & Artificial Intelligence Laboratory (CSAIL).
This research was sponsored by grants from: Office of Naval Research (DARPA) Contract No. MDA972-04-1-0037, Office of Naval Research (DARPA) Contract No. N00014-02-1-0915, National Science Foundation (ITR/IM) Contract No. IIS-0085836, National Science Foundation (ITR/SYS) Contract No. IIS-0112991, National Science Foundation (ITR) Contract No. IIS-0209289, National Science Foundation-NIH (CRCNS) Contract No. EIA-0218693, National Science Foundation-NIH (CRCNS) Contract No. EIA-0218506, and National Institutes of Health (Conte) Contract No. 1 P20 MH66239-01A1.
Additional support was provided by: Central Research Institute of Electric Power Industry, Center for e-Business (MIT), Daimler-Chrysler AG, Compaq/Digital Equipment Corporation, Eastman Kodak Company, Honda R&D Co., Ltd., ITRI, Komatsu Ltd., Eugene McDermott Foundation, Merrill-Lynch, Mitsubishi Corporation, NEC Fund, Nippon Telegraph & Telephone, Oxygen, Siemens Corporate Research, Inc., Sony MOU, Sumitomo Metal Industries, Toyota Motor Corporation, and WatchVision Co., Ltd.
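The abstract describes the regularizer only as adding corrupted copies of the original data to the training set. As a rough illustration, the sketch below assumes one plausible corruption scheme: copy a training example and overwrite a single randomly chosen feature with the value that feature takes in another example, keeping the original label. The function name `knock_out_augment`, its parameters, and the corruption rule itself are assumptions made for illustration and are not taken from the abstract.

```python
import random


def knock_out_augment(X, y, n_copies, seed=0):
    """Return copies of X and y extended with n_copies corrupted examples.

    Assumed corruption rule (not stated in the abstract): each new example
    copies a random training row and overwrites one randomly chosen feature
    with the value of that feature in another randomly chosen row. The label
    of the copied row is kept unchanged.
    """
    rng = random.Random(seed)
    n_rows, n_features = len(X), len(X[0])
    X_aug = [list(row) for row in X]  # start from the untouched data
    y_aug = list(y)
    for _ in range(n_copies):
        src = rng.randrange(n_rows)       # example to corrupt
        donor = rng.randrange(n_rows)     # example that supplies the value
        feat = rng.randrange(n_features)  # feature to knock out
        corrupted = list(X[src])
        corrupted[feat] = X[donor][feat]
        X_aug.append(corrupted)
        y_aug.append(y[src])
    return X_aug, y_aug


# Toy data: three examples with three features each.
X = [[0.0, 1.0, 2.0], [3.0, 4.0, 5.0], [6.0, 7.0, 8.0]]
y = [1, -1, 1]
X_aug, y_aug = knock_out_augment(X, y, n_copies=4)
```

Under this assumed scheme, a classifier trained on `X_aug` cannot depend on any single feature being trustworthy, which is the kind of pressure toward redundancy the abstract describes.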
Similar resources
Study of the Economic Nature of the Barrier Options and Its Jurisprudential Analysis
The purpose of this study is to investigate the economic and jurisprudential nature of barrier options. Options are a type of derivative instrument in the financial markets that gives a person the right, without the obligation, to buy or sell an asset. This tool is used along with other types of derivative tools for hedging risk and for speculation. Two kinds of barrier option are the Knock-In and Knock-out...
Feature Scaling for Kernel Fisher Discriminant Analysis Using Leave-One-Out Cross Validation
Kernel Fisher discriminant analysis (KFD) is a successful approach to classification. It is well known that the key challenge in KFD lies in the selection of free parameters such as kernel parameters and regularization parameters. Here we focus on the feature-scaling kernel, where each feature is individually associated with a scaling factor. A novel algorithm, named FS-KFD, is developed to tune th...
From Transformation-Based Dimensionality Reduction to Feature Selection
Many learning applications are characterized by high dimensions. Usually not all of these dimensions are relevant and some are redundant. There are two main approaches to reduce dimensionality: feature selection and feature transformation. When one wishes to keep the original meaning of the features, feature selection is desired. Feature selection and transformation are typically presented sepa...
Nearest Neighbor Based Feature Selection for Regression and its Application to Neural Activity
We present a non-linear, simple, yet effective, feature subset selection method for regression and use it in analyzing cortical neural activity. Our algorithm involves a feature-weighted version of the k-nearest-neighbor algorithm. It is able to capture complex dependency of the target function on its input and makes use of the leave-one-out error as a natural regularization. We explain the cha...
Stationarity of Matrix Relevance Learning Vector Quantization
We investigate the convergence properties of heuristic matrix relevance updates in Learning Vector Quantization. Under mild assumptions on the training process, stationarity conditions can be worked out which characterize the outcome of training in terms of the relevance matrix. It is shown that the original training schemes single out one specific direction in feature space which depends on th...